Amazon DocumentDB
Amazon DocumentDB is a scalable, fully managed document database service that supports MongoDB workloads. It is designed to handle JSON data, making it ideal for use cases that involve content management, catalogs, user profiles, and mobile applications. DocumentDB is highly available, automatically replicating data across multiple Availability Zones.
Key Features
- MongoDB Compatibility: Amazon DocumentDB is compatible with the MongoDB API, enabling seamless migration of MongoDB workloads with minimal changes to the application code.
- Fully Managed: As a fully managed service, DocumentDB handles database management tasks such as provisioning, patching, backup, recovery, and scaling.
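- Scalability: DocumentDB scales read capacity by adding up to 15 read replicas, and cluster storage grows automatically, without downtime, up to the service limit (128 TiB for instance-based clusters in current AWS documentation; older material cites 64 TB).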
- High Availability: DocumentDB provides multi-AZ deployment with automated failover and continuous backup to Amazon S3 with point-in-time recovery.
- Performance: Optimized for performance, DocumentDB separates compute and storage, allowing each to scale independently.
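- Security: Provides encryption at rest (through AWS KMS) and in transit (TLS). AWS Identity and Access Management (IAM) controls who can create, modify, and delete clusters, while role-based access control (RBAC) inside the database restricts what database users can do.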
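The features above show up directly in how an application connects. The following is a minimal sketch, not a definitive implementation, that builds a MongoDB-style connection URI with TLS enabled and reads routed to replicas when they are available. The helper name build_docdb_uri, the user name, the password, and the cluster endpoint are hypothetical; the CA file name assumes the Amazon certificate bundle has already been downloaded next to the application.

```python
from urllib.parse import quote_plus, urlencode


def build_docdb_uri(user, password, cluster_endpoint, port=27017,
                    ca_file="global-bundle.pem"):
    """Build a MongoDB-style connection URI for a DocumentDB cluster."""
    options = {
        # Encryption in transit
        "tls": "true",
        # Path to the Amazon CA bundle (assumed to be downloaded beforehand)
        "tlsCAFile": ca_file,
        # Connect in replica set mode so the driver discovers all instances
        "replicaSet": "rs0",
        # Send reads to replicas when available, otherwise to the primary
        "readPreference": "secondaryPreferred",
        # DocumentDB does not support retryable writes
        "retryWrites": "false",
    }
    # Escape special characters such as "@" or "/" in the credentials
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    return f"mongodb://{credentials}@{cluster_endpoint}:{port}/?{urlencode(options)}"


uri = build_docdb_uri(
    "appuser",
    "p@ss/word",
    "mycluster.cluster-abc123.us-east-1.docdb.amazonaws.com",  # hypothetical endpoint
)
print(uri)
```

The resulting string can be passed to any MongoDB driver, for example pymongo's MongoClient. Using the cluster endpoint together with the secondaryPreferred read preference is one common way to spread reads across replicas while writes still go to the primary.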
Common Use Cases
- Content Management: Ideal for managing unstructured data like blogs, articles, and documents that are stored as JSON.
- Catalogs: Use DocumentDB to store and manage catalogs, such as product catalogs, with flexible schema and easy querying.
- User Profiles: DocumentDB is well-suited for storing user profile data, where the schema might evolve over time.
- Mobile Applications: Use DocumentDB to back mobile applications that require flexible schema and fast access to user data.
- Internet of Things (IoT): Store and manage large volumes of time-series data generated by IoT devices in a scalable, efficient way.
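To make "flexible schema" concrete, the sketch below models two catalog documents that live in the same collection but carry different fields. Plain Python dictionaries stand in for documents and no database is contacted; the field names and the find helper are illustrative assumptions, not part of any DocumentDB API.

```python
import json

# Two documents in the same (hypothetical) "catalog" collection.
# They share some fields but are not required to share all of them.
catalog = [
    {"_id": "sku-1001", "name": "Trail Shoe", "price": 89.0, "sizes": [8, 9, 10]},
    {"_id": "sku-2001", "name": "E-book Reader", "price": 129.0,
     "specs": {"storage_gb": 32, "screen_in": 6.8}},
]


def find(docs, **criteria):
    """In-memory stand-in for an equality query on top-level fields."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


# Each document serializes to JSON as-is, nested fields included
print(json.dumps(catalog[1], indent=2))

# A query only touches the fields it names; documents lacking a field simply do not match
matches = find(catalog, name="Trail Shoe")
print([d["_id"] for d in matches])
```

Adding a new attribute to one product later requires no schema migration; only that document changes.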
Architecture Overview
The architecture of Amazon DocumentDB consists of the following layers and components:
- Storage Layer: DocumentDB's storage layer is distributed, fault-tolerant, and self-healing. It automatically keeps six copies of your data across three Availability Zones.
- Compute Layer: The compute layer, which handles query processing, is separated from the storage layer, enabling independent scaling of compute and storage resources.
- Backup and Recovery: Continuous backup to Amazon S3 enables point-in-time recovery, so you can restore your data to any point within the backup retention period (1 to 35 days).
- Read Replicas: DocumentDB supports adding read replicas to scale read operations and improve the availability of the database.
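Point-in-time recovery restores into a new cluster rather than overwriting the existing one. The sketch below, offered under stated assumptions, prepares the request parameters for such a restore and checks that the requested time falls inside the restorable window. The function name build_pitr_request and the cluster identifiers are hypothetical; the parameter names follow the boto3 docdb client's restore_db_cluster_to_point_in_time operation, and the window bounds are assumed to come from the EarliestRestorableTime and LatestRestorableTime fields of a describe_db_clusters response.

```python
from datetime import datetime, timedelta, timezone


def build_pitr_request(source_cluster, new_cluster, restore_to,
                       earliest_restorable, latest_restorable):
    """Return keyword arguments for a point-in-time restore into a NEW cluster."""
    if not (earliest_restorable <= restore_to <= latest_restorable):
        raise ValueError("restore_to is outside the restorable window")
    return {
        "SourceDBClusterIdentifier": source_cluster,  # cluster to restore from
        "DBClusterIdentifier": new_cluster,           # name of the new cluster
        "RestoreToTime": restore_to,                  # timezone-aware UTC datetime
    }


# Fixed timestamp so the example is reproducible
now = datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc)
params = build_pitr_request(
    "orders-prod",      # hypothetical source cluster
    "orders-restored",  # hypothetical target cluster
    restore_to=now - timedelta(hours=3),
    earliest_restorable=now - timedelta(days=7),
    latest_restorable=now - timedelta(minutes=5),
)
print(params)

# With boto3 installed and AWS credentials configured, the call would be:
#   boto3.client("docdb").restore_db_cluster_to_point_in_time(**params)
```

Validating the timestamp locally is a design convenience; the service performs its own check and rejects times outside the retention window.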
Integration with Other AWS Services
Amazon DocumentDB integrates with several AWS services to provide a comprehensive document database solution:
- AWS Lambda: Build serverless applications that interact with your DocumentDB database using AWS Lambda.
- Amazon CloudWatch: Monitor performance and operational metrics for your DocumentDB clusters using Amazon CloudWatch.
- Amazon VPC: Run DocumentDB in a VPC to isolate your database within your virtual network and control access using security groups.
- AWS DMS: Migrate existing MongoDB databases to DocumentDB using AWS Database Migration Service (DMS) with minimal downtime.
- Amazon S3: DocumentDB stores its continuous backups and cluster snapshots in Amazon S3 on your behalf, so backup data is kept durably and separately from the cluster's storage volume.
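As an illustration of the Lambda integration, here is a minimal handler sketch that looks up one user profile. It assumes the function runs inside the same VPC as the cluster, that the third-party pymongo driver is packaged with the function, and that the connection URI is supplied in an environment variable; the variable name DOCDB_URI, the database name appdb, and the collection name profiles are all hypothetical. The optional collection argument exists only so the logic can be exercised without a live cluster.

```python
import os

_client = None  # cached so warm invocations reuse the same connection


def get_collection():
    """Lazily create a client and return the (hypothetical) profiles collection."""
    global _client
    if _client is None:
        # Third-party driver; imported here so the module loads without it
        from pymongo import MongoClient
        _client = MongoClient(os.environ["DOCDB_URI"])  # assumed variable name
    return _client["appdb"]["profiles"]


def handler(event, context, collection=None):
    """Look up one user profile by id and return an HTTP-style response."""
    coll = collection if collection is not None else get_collection()
    doc = coll.find_one({"_id": event["user_id"]})
    if doc is None:
        return {"statusCode": 404, "body": "profile not found"}
    return {"statusCode": 200, "body": doc}
```

Creating the client outside the handler body, and only once, avoids opening a new connection on every invocation, which matters because each DocumentDB instance has a finite connection limit.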
Things to Remember for the Exam
- MongoDB Compatibility: Remember that DocumentDB implements the MongoDB API rather than running the MongoDB engine itself, so some MongoDB operators and advanced features are not supported.
- Backup and Recovery: Know how DocumentDB handles backups and how to perform point-in-time recovery.
- Scalability: Understand how DocumentDB scales compute and storage independently, and how to use read replicas to scale read capacity.
- High Availability: Review the high availability features of DocumentDB, including multi-AZ replication and automated failover.
- Security: Be familiar with how IAM controls access to DocumentDB management operations, and know that DocumentDB supports encryption at rest and in transit.
- Use Cases: Study the common use cases for DocumentDB, particularly in scenarios that involve unstructured or semi-structured data like JSON.